1) Summary

I propose implementing a virtual address space of 2^24 words at the
instruction level. All language systems can use this virtual address space.
Mapping virtual addresses into physical addresses is done, in the most frequent
case, with a small set of non-associative mapping registers.



2) Addressing 



Programs run in a 2^24-word virtual address space. I'll discuss addressing
this space in terms of a Nova-like instruction set, not because this is optimal
but because it's a concrete place to start.



2.1 Instruction Format: 



An instruction which references memory has: 

I - an indirect bit 

X - a base register field (2 bits) 

D - a displacement (8 bits) 

2.2 Registers:

All registers dealing with addresses (e.g. PC and index registers) are at
least 24 bits wide (32 if that proves useful for other reasons).



2.3 Indirection: 



Indirected instructions are pre-indexed, not post-indexed, and the address
obtained at the first level is interpreted as the first word of a 2-word block
containing a 24-bit virtual address. Hence, any word in memory can be
addressed.

2.4 Base Register Field Interpretation:

00 - PC

01,10,11 - AC1,AC2,AC3

2.5 Comment: 

The proposed scheme works somewhat better if more base registers are used,
e.g. 8 or 16. A modified instruction format is then necessary.

3) Mapping Virtual Addresses onto Physical Addresses 

3.1 Observation: 

The only addresses that can be accessed are those obtained by indirection
or thru the base registers. Locality considerations argue that the latter will
be the most prevalent mode.

3.2 The Idea: 

When a program is making repeated references thru base register X to a
virtual page VP, that page will reside in some core page CP. We introduce
hardware which for each base register X gives the current core page CP[X].
Instructions which use base register X are (generally) mapped into core page
CP[X].

3.3 Details: 

Each of the 4 base registers has associated with it two other registers:
last virtual page number (LVP) and core page (CP). Whenever base register X is
used to form an address, the following occurs:

    virtual address = D + contents of X

    virtual page #  = high-order 15 bits of virtual address

    displacement    = low-order 9 bits of virtual address

    if virtual page # = LVP[X]

        then memory address = <CP[X], displacement>

        else base register fault

In the successful case, mapping requires one comparison and no associative
hardware. Indirection is treated as an implicit base register; the 24-bit
virtual address is compared against LVP[I], etc.
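
The steps of 3.3 can be sketched in a few lines of Python. This is only an
illustration of the fast path under stated assumptions: the names
map_address and BaseRegisterFault are hypothetical, the page size is taken
as 2^9 words (the value suggested in 3.5), and the hash lookup that follows
a fault is not modelled.

```python
PAGE_BITS = 9                      # assumed 2^9-word pages
ADDR_MASK = (1 << 24) - 1          # 24-bit virtual addresses


class BaseRegisterFault(Exception):
    """Raised when the fast path cannot map the address (hypothetical name)."""


def map_address(x_contents, d, lvp, cp):
    """Map (D + contents of X) using the LVP/CP pair of base register X."""
    vaddr = (x_contents + d) & ADDR_MASK
    vpage = vaddr >> PAGE_BITS                 # high-order 15 bits
    disp = vaddr & ((1 << PAGE_BITS) - 1)      # low-order 9 bits
    if vpage != lvp:                           # the single comparison
        raise BaseRegisterFault(vpage)
    return (cp << PAGE_BITS) | disp            # <CP[X], displacement>
```

A reference that stays on virtual page LVP[X] is mapped by concatenation
alone; one that crosses the page boundary raises the fault, which in the
proposal would fall back to the hash lookup of 3.4.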

3.4 Base Register Faults: 

A base register fault is caused either because (D + contents of X) is on a
different page than contents of X or because the contents of X has been changed
since the last memory reference thru X. When a base register fault occurs, a
hashing scheme like Peter's ("The Lisp Alto Map", 5/30/70) is used to find the
page if it is in core. (If all languages use the hash mechanism, there is
economic incentive to shift more of the work into hardware, e.g. put the hash
table into fast memory - around 2^13 bits for a half-full table.) If the page
is not in core, the processor delivers an interrupt; it's up to the language to
decide what page(s) should be shoved out.



3.5 Page Size:

Displacement addressing (2^8) and page size need not be identical but
should be comparable. I suggest 2^9.



3.6. Generalization 



The pair LVP,CP provides a fast way of mapping a single page. If we are
willing to complicate everything somewhat, this can be generalized to a set of
pages. Define a page group to be a contiguous set of virtual pages residing in
a contiguous set of core pages. Associate with each base register CP as above
and lower-upper virtual bounds registers LVP and UVP which span a page group.
CP[X] serves as a base for the core address if LVP[X] < virtual page # <
UVP[X].

This has the advantage of allowing chunks of memory to be mapped directly,
up to all of core. For example, a program which is preplanned to run in no
more than 64K of the address space can set up a single page group of that size.



This has the following disadvantages: 



(a) additional hardware, since testing virtual page #
against LVP, UVP requires 2 subtractions (in parallel) instead of one
comparison and requires forming the core page # by addition instead
of concatenation.

(b) a more complicated core manager, since it is necessary
to allocate variable size core chunks.

(c) somewhat more complex communication between the program
and the swapper, since the program must be able to specify segments.



4) Some Usage Modes

4.1 Within a Module:

Code and data can be accessed as D + contents of PC and will generally be
mapped directly.



4.2 Cross-Module Code Linkage: 



Indirect - requiring 48 bits in all.

4.3 Local Variables:

Using one of the base registers as a frame pointer, local variables can be
accessed as D + contents of frame register and will generally be mapped
directly.

4.4 Lisp CAR and CDR: 

Use Danny's hash-linking scheme. Local pointers (non-linked) are converted
into virtual addresses when put into a base register by adding to them the
contents of the base register which bases their page. Linked pointers are
full virtual addresses as in Danny's scheme.

4.5 Array processing: 

Sequential accesses to anything, arrays in particular, are generally mapped 
directly. 

5) Drawbacks 

5.1. Indirection takes 2 words rather than 1 on the present address
structure.

5.2. Unless the page group mechanism is implemented and employed, addresses
generated in a random pattern within a core working set (e.g. tree-sorting a
32K word array) must go through the hash lookups, since the base register
mapping hardware will do little good.

5.3. Since the display hardware runs in the physical address space and the 
program runs in a virtual address space, it is necessary to be able to 
establish a correspondence between them. This is an understood problem (Tenex) 
with a reasonable solution (locked pages). Page groups (c.f. 3.6) help here. 
However, implementing locked pages is still a complication. 

5.4. BCPL is complicated since full addresses become 24 bits, while 
integers probably should remain 16 bits. Making the distinction would, at 
the least, introduce some complication with BCPL. 

6) Operating System

6.1. Since the virtual memory is large, the operating system can live in
the same address space.

6.2. Protection of the operating system from the program or of the program
from regions of itself is a separate issue. A write-protect bit for each of
the core page registers (c.f. 3.3) can be included, if memory protection seems
needed.

1) Summary

I propose that a virtual address space of 2^24 words be implemented at the
instruction and micro-instruction level. All language systems can use this
virtual address space. Mapping virtual addresses into physical addresses is
done, in the most frequent case, with a small set of non-associative mapping
registers.

2) Addressing 

Programs run in a 2^24-word virtual address space. I'll discuss addressing
this space in terms of a Nova-like instruction set because it's a concrete
place to start.

2.1 Instruction Format: 

An instruction which references memory has: 

I - an indirect bit 

X - a base register field (2 bits) 

D - a displacement (8 bits) 

2.2 Registers: 

All registers dealing with addresses (e.g. PC and index registers) are at
least 24 bits wide.

2.3 Indirection:

Indirected instructions are pre-indexed, not post-indexed, and the address
obtained at the first level is interpreted as the first word of a 2-word block
containing a 24-bit virtual address. Hence, any word in memory can be
addressed.



Alto Virtual Memory                June 18, 1974                Page 2

Proposal - Version 2 
Ben Wegbreit 

2.4 Base Register Field Interpretation:

00 - a special 24-bit register GR which can be loaded by special
instructions.

01 - PC 

10,11 - AC2,AC3 

2.5. Data width: 

Since addresses are 24 bits wide, this is the standard width of full
pointers. Integers should probably remain 16 bits wide. Short pointers (e.g.
relative to the first word of their page) should also be supported. Registers
dealing with addressing are 24 bits wide. Hence it is necessary to be able to
load and store both 16 and 24 bit quantities into and from these registers.

2.6: Non-Nova Instructions: 

Mesa and Altolisp will be programmed mainly by micro-interpreters for their
specialized instruction sets. Hence, the virtual address space must be
available at the micro-instruction level. In that mode, it may be possible to
use more base registers, e.g. 8 or 16, which would make the scheme work
somewhat better. The number of registers is limited by the bits available to
address them in micro-instructions and by card capacity.

3) Mapping Virtual Addresses onto Physical Addresses 

3.1. Observation 

Addresses typically are not generated at random but rather are obtained by
relatively small changes to a prior address. Examples:

(a) small change to current PC for next instruction and local jumps,

(b) accessing local variables as a small offset from a frame pointer,

(c) incrementing an index register in fetching consecutive words of an
array,

(d) in Lisp, car and cdr pointers local to a segment

These prior addresses are often obtained from the registers: PC and the ACs.

3.2. The Idea 

Supply with every memory request an indication (access code) of which
register was used in generating the address. If the new address is close
enough to the prior address, then it will probably be in core already. We
introduce hardware which for each base register X gives the current core page
CL[X]. Addresses generated from base register X are generally mapped by CL[X].

Generalize this as follows. Define a page group to be a contiguous set of
virtual pages residing in a contiguous set of core pages. The program
specifies a page group to the page manager by a JSYS-like command. Whenever
any page in the group is subsequently referenced, the entire group is obtained.
CL maps an entire page group.

3.3. Details: 

Each of the base registers has associated with it three other registers:

    LVP - low virtual page
    HVP - high virtual page
    CL  - low core page

LVP[X] and HVP[X] hold the page numbers of the low and high pages in the last
page group accessed thru base register X. Whenever base register X is used to
form an address, the following occurs:

    virtual address = D + contents of X

    virtual page #  = high-order 15 bits of virtual address

    if LVP[X] < virtual page # < HVP[X]

        then memory address = CL[X] + virtual address

        else base register fault

Actually, (CL[X] + virtual address) is formed first, then the memory fetch is
initiated, then the comparison is carried out to see if the word which will
be fetched is the right one. Hence, in the successful case, mapping delays the
memory access by one add time.

Indirection is treated as an implicit base register; the 24-bit virtual
address is compared against LVP[I], etc.
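
The page-group check of 3.3 can be sketched as follows. Several things here
are assumptions rather than part of the proposal: the names relocation and
map_in_group are hypothetical, pages are taken as 2^9 words, the bounds are
treated as inclusive so that a one-page group can match, and CL[X] is
modelled as a signed word offset chosen so that CL[X] + virtual address
lands in the right core page.

```python
PAGE_BITS = 9  # assumed 2^9-word pages


def relocation(lvp, low_core_page):
    """Word offset that carries virtual page lvp onto core page low_core_page."""
    return (low_core_page - lvp) << PAGE_BITS


def map_in_group(vaddr, lvp, hvp, cl):
    """Return the core address, or None to signal a base register fault."""
    core = cl + vaddr                 # formed first, so the fetch can start
    vpage = vaddr >> PAGE_BITS        # high-order 15 bits
    if lvp <= vpage <= hvp:           # bounds check completes last
        return core
    return None                       # the speculative fetch is discarded


# Hypothetical group: virtual pages 8..11 resident in core pages 2..5.
CL = relocation(8, 2)
```

With this group, word 5 of virtual page 9 maps to word 5 of core page 3,
while any address on virtual page 12 reports a fault.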

3.4 Base Register Faults: 

A base register fault is caused either because (D + contents of X) is in a
different page group than contents of X or because the contents of X has been
changed since the last memory reference thru X. When a base register fault
occurs, a hashing scheme like Peter's ("The Lisp Alto Map", 5/30/70) is used to
find the page if it is in core. (If all languages use the hash mechanism, there
is economic incentive to shift more of the work into hardware, e.g. put the
hash table into fast memory.) If the page is not in core, the processor
delivers an interrupt; it's up to the language to decide what page(s) should be
shoved out.

3.5 Page Size: 

Displacement addressing (2^8) and page size need not be identical but
should be comparable. I suggest 2^9.

3.6. Non-Nova Instructions:

In non-Nova mode, the correspondence between base register and 24-bit
virtual address must be done at the micro-instruction level, e.g. via some bits
of the micro-instruction, or via the contents of some machine register.

3.7 Disk management: 

It is desirable to have page groups be identical with partitions. That is,
given the address of the first page in a group on the disk, the addresses of
the other pages should be determined by a simple computation, and pages should
be placed so as to minimize the time to read them all in.

4) Some Usage Modes

4.1 Within a Module:

Code and data can be accessed as D + contents of PC and will generally be
mapped directly.

4.2 Cross-Module Code Linkage: 
Indirect - requiring 48 bits in all.

4.3 Local Variables:

Using one of the base registers as a frame pointer, local variables can be
accessed as D + contents of frame register, and will generally be mapped
directly.

4.4 Lisp CAR and CDR:

Use Danny's hash-linking scheme. Local pointers (non-linked) are converted
into virtual addresses when put into a base register by adding to them the
contents of the base register which bases their page. Linked pointers are
full virtual addresses as in Danny's scheme.

4.5 Array processing:

Sequential accesses to anything, arrays in particular, are generally mapped
directly.

4.6. Small programs: 

A program which is preplanned to run in no more than 64K of the address
space can set up a single page group of that size and have all memory accesses
mapped directly.

4.7. Bit map for the display: 

A page group. 

5) Drawbacks 

5.1. Indirection takes 2 words rather than 1 on the present address 
structure. 



5.2. BCPL is complicated since full addresses become 24 bits, while
integers probably should remain 16 bits. Making the distinction would
introduce substantial complication with BCPL. (c.f. 2.5)

5.3. Additional hardware is required to make the mapping fast enough to be
acceptable.

5.4. When a base register fault occurs, the mapping registers must be
loaded after hash-table lookup. Hence, in the worst case, if every reference
faulted, the proposed scheme would run slower than using hashing alone.

6) Operating System 

6.1. Since the virtual memory is large, the operating system can live in
the same address space.

6.2. Protection of the operating system from the program or of the program
from regions of itself is desirable, but a separable issue. Using this
proposal, it suffices to add Tenex-style bits for read, write, and execute
access to the mapping registers (c.f. 3.3) and maintain them in the hash table
(c.f. 3.4) to be loaded on base register fault.

7) Separation 

There are roughly six ideas combined here; pulling them apart may be
helpful:

1) large address space (greater than 2^16)

2) access code to specify which mapping register will probably map the
address.

3) Peter's hash table when (2) gets a fault

4) page groups to get: each mapping register to map several pages,
preloading, and escape back to the bare addressing structure.

5) correspondence between physical disk allocation and page groups to
speed up access to all of group

6) access protection
1) Summary

It is proposed that a virtual address space of 2^24 words be implemented at
the micro-instruction level. All language systems can use this virtual address
space. Mapping virtual addresses into physical addresses is done, in the most
frequent case, with a small set of non-associative mapping registers.

2) Addressing

Programs run in a 2^24-word virtual address space. I'll discuss addressing
this space at the micro-instruction level.

2.1. Base Registers

Some number (8 or 16) of 24-bit base registers are added to the Alto (on
the memory interface board).

2.2. Address Formation

Addresses are of two sorts:

(a) normal mode: 16-bit displacement from the contents of one of the base
registers.

(b) full address: a 24-bit full address

In normal mode, a micro-instruction referencing memory supplies a 16-bit
displacement as the output of the ALU and a 3 or 4 bit field (specifying base
register number) in the micro-instruction. The micro-instruction field is ORed
with the contents of a special loadable S register to get the base register
number. The displacement is added to the contents of the base register to get
the virtual address.

2.3. Manipulating Base Registers

Micro-instructions are provided for loading base registers and their
associated mapping registers (c.f. Section 3.2). Additionally, the following
may be useful: incrementing, adding 16-bit displacements, and storing.

3) Mapping Virtual Addresses onto Physical Addresses

3.1 The Idea

When an address is formed as the displacement from a base register, if the
new address is close enough to the prior contents of the base register then it
will probably be in core already. We introduce hardware which for each base
register X gives the current core page CL[X]. Addresses generated from base
register X are generally mapped by CL[X].

Generalize this as follows. Define a page group to be a contiguous set of
virtual pages residing in a contiguous set of core pages. The program
specifies a page group to the page manager by a JSYS-like command. Whenever
any page in the group is subsequently referenced, the entire group is obtained.
CL maps an entire page group.

3.2. Details:

Each of the base registers has associated with it a block of three other
registers, the mapping registers:

    LVP - low virtual page
    HVP - high virtual page
    CL  - low core page

LVP[X] and HVP[X] hold the page numbers of the low and high pages in the last
page group accessed thru base register X. Whenever base register X is used to
form an address, the following occurs:

    virtual address = D + contents of X

    virtual page #  = high-order 15 bits of virtual address

    if LVP[X] < virtual page # < HVP[X]

        then memory address = CL[X] + virtual address

        else base register fault

Actually, (CL[X] + virtual address) is formed first, then the memory fetch is
initiated. Hence, in the successful case, mapping delays the memory access by
one add time. The comparison is done last and is actually performed more
simply - Chuck Thacker worked out a method that requires only one comparison
for bounds.

3.3. Base Register Faults 

A base register fault is caused either because (D + contents of X) is in a
different page group than contents of X or because the contents of X has been
changed since the last memory reference thru X. When a base register fault
occurs, a hashing scheme like Peter's ("The Lisp Alto Map", 5/30/70) is used to
find the page if it is in core. Call this a hashed map. (If all languages use
the hash mechanism, there is economic incentive to shift more of the work into
hardware, e.g. hardware hash.) If the page is not in core, the processor
delivers an interrupt; it's up to the language to decide what page(s) should be
shoved out.



3.4 Page Size

Displacement addressing (2^8) and page size need not be identical but
should be comparable. I suggest 2^9.



3.5. Disk management

It is desirable to have page groups be identical with partitions. That is,
given the address of the first page in a group on the disk, the addresses of
the other pages should be determined by a simple computation, and pages should
be placed so as to minimize the time to read them all in.



3.6 Full Addresses 



A full 24-bit address can be delivered with a base register number X, to be
interpreted as: map the address with the bounds registers of X. This is
intended to provide a way of using a previously set up base register as a hint
for mapping a full address without affecting its contents. Its utility is
somewhat marginal.



4) Limitations of Mapping



The mapping hardware proposed here works well only if most memory
references are issued as displacements from previously loaded base registers
and most of these are mapped directly. The hash table lookup takes around 2.0
ns in the best case (with current hardware). A ratio of around 10:1 memory
references to hashed map references is required if this is to be practical.
Howard Sturgis, Peter Deutsch, and Jim Mitchell are collecting statistics for
the dynamic behavior of BCPL, Bytelisp, and Mesa.

These statistics provide memory references more or less unambiguously, but
the number of hashed map references depends on how the language processors use
the base registers. In this regard, there are three classes of memory
references:

(1) Easily represented as base register + displacement, since the base register
contains a fixed, identifiable component of the pseudo-machine supporting the
language, e.g. PC, stack pointer(s), global pointer(s).

(2) possibly represented as base register + displacement, depending on the
compilation of the language, e.g. array access, accesses to several fields of a
pointer-based structure, cars and cdrs on the same page.

(3) not represented as base register + displacement, e.g. address of a
procedure in another segment, a newly constructed pointer.

Class (3) must be hash mapped; class (1) will be directly mapped in most
implementations; class (2) presents an uncertainty. Use of base registers here
is similar to the technical problem of using index registers well in normal
compilation; the incentive for doing so is substantially greater. It appears
that this will be easier to do for array processing (e.g. in Mesa) than for
list processing (e.g. in Bytelisp) since more is done in-line (procedure calls
are usually treated as destroying state information).



5) Some Usage Modes

5.1 Within a Module:

Code and data can be accessed as D + contents of PC and will be mapped
directly.

5.2 Cross-Module Code Linkage:

Requires a full 24-bit pointer and a hashed map.

5.3 Local Variables:

Using one of the base registers as a frame pointer, local variables can be
accessed as D + contents of frame register and will be mapped directly.

5.4 Lisp CAR and CDR:

Use Danny's hash-linking scheme. Local pointers (non-linked) are converted
into virtual addresses when put into a base register by adding to them the
contents of the base register which bases their page. Linked pointers are
full virtual addresses as in Danny's scheme.

5.5 Array processing:

Sequential accesses to anything, arrays in particular, can be mapped
directly, but this depends on the language processor (c.f. Section 4).



5.6. Small programs:

A program which is preplanned to run in no more than 64K of the address
space can set up a single page group of that size and have all memory accesses
mapped directly.

5.7. Bit map for the display:

A page group.



6) Nova Emulation Mode

It is desirable to let existing Nova machine language programs run
unchanged and also to allow use of the 2^24-word address space if wanted. The
following is a compromise.

6.1. Instruction Format

An instruction which references memory has:

I - an indirect bit

X - a base register field (2 bits)

D - a displacement (8 bits)

6.2 Registers:

All registers dealing with addresses (e.g. PC and index registers) are 24
bits wide.

6.3 Base Register Field Interpretation:

00 - a special 24-bit register GR which can be loaded by special
instructions.

01 - PC

10,11 - AC2,AC3



6.4. Data width:

Since addresses are 24 bits wide, this is the standard width of full
pointers.

6.5. Indirection

Indirected instructions are interpreted as currently on the Alto. Hence,
only 2^16 words can be accessed this way.

6.6. Register Overflow

Registers used in Nova emulation mode which overflow 16 bits will behave
differently. Sorry.

6.7. BCPL

BCPL is complicated since full addresses become 24 bits, while integers
probably should remain 16 bits. Making the distinction would require changes
to the BCPL compiler and would make the language into a non-standard dialect.



7) Operating System

7.1. Since the virtual memory is large, the operating system can live in
the same address space.

7.2. Protection of the operating system from the program or of the program
from regions of itself is desirable, for the usual reasons. Using this
proposal, it suffices to add Tenex-style bits for read, write, and execute
access to the mapping registers (c.f. 3.2) and maintain them in the hash table
(c.f. 3.3) to be loaded on base register fault.



8) Separation

There are roughly six ideas combined here; pulling them apart may be
helpful:

1) large address space (greater than 2^16)

2) access code to specify which mapping register will probably map the
address.

3) Peter's hash table when (2) gets a fault

4) page groups to get: each mapping register to map several pages,
preloading, and escape back to the bare addressing structure.

5) correspondence between physical disk allocation and page groups to
speed up access to all of group

6) access protection



9) Unresolved Questions

9.1. Usage

(a) How effectively can the language processors use the base registers for
class (2) references?

(b) What percent of memory references will be mapped directly?

9.2. Design Parameters

(a) Is the full 24-bit address worth providing?

(b) How extensive should the arithmetic operations on base registers be?

(c) How many base registers should there be?

(d) Should the 16-bit displacement be treated as a signed integer (sign bit
extended) to give relative addressing to either side of the base register?

1. SCOPE

This is an improvement to the design in "Alto Virtual Memory Update",
Sept. 13, 1974. A subsequent memo will consolidate the two previous memos with
this one to produce a final design.

2. SUMMARY

A proposed reorganization of the base registers, along with a small amount
of additional hardware, can be used to reduce the address translation fault
rate by approximately a factor of two.

3. THE IDEAS

In the previous design, a single base register was associated with each
logical function. The new proposed design allows multiple (N=2) base registers
to be associated with a logical function. Of the pair of registers, one is
current at any given time while the other serves as an alternate to be tried,
under appropriate circumstances, when the current one is incorrect. State
information in the memory interface determines which one of a pair is deemed to
be current and is kept updated so that successive references to the same page
group through a pair try the current one first.

As an example, consider the planned design for Mesa ("The Implementation 
of Mesa on Alto", 8/21/74). Faults can be caused by 

(1) user-computed addresses 

(2) non-local procedure calls 

The faults for each of these may be decreased as follows: 

(3.1) User-computed Addresses 

The previous design uses one base register for each of read and write.
The new proposed design uses a pair of base registers for each of read and
write: a current register and an alternate. Consider reads; writes are
analogous. When a read request occurs in 24-bit addressing mode, the
current read register is tried first; if it maps successfully, we're done. If
it does not map successfully, then the alternate is tried. If it succeeds,
then it becomes the new current read register; this realizes MRU trial of the
registers in the read pair. If the alternate fails, then a fault occurs and,
in processing the fault, the (old) alternate is reloaded; this implements LRU
replacement. To distinguish this mode of addressing from the 24-bit mode
described in the previous memo, call this a G-mode memory reference.

Trying the alternate base register takes 2 minor cycles. Loading a base 
register takes about 25. Hence trying the alternate is cost effective in time 
if the chance of success on the alternate (given that the current has failed) 
is better than 2/25 = .08. Measurements for Mesa give probabilities of .47,
.08, and .44 (three different programs); BCPL gives .63. That is, trying the 
alternate is a clear winner in time. 
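
The break-even argument can be checked with a short calculation. This is a
sketch under a simple cost model that is an assumption here: after the
current register has missed, trying the alternate always costs 2 minor
cycles, and a further 25 are spent loading a register only when the
alternate misses as well. The names TRY_ALTERNATE, LOAD_REGISTER and
expected_cost are illustrative.

```python
TRY_ALTERNATE = 2    # minor cycles to try the alternate base register
LOAD_REGISTER = 25   # minor cycles to load a base register


def expected_cost(p_alt):
    """Expected minor cycles after a miss on current, trying the alternate first."""
    return TRY_ALTERNATE + (1.0 - p_alt) * LOAD_REGISTER


# Trying the alternate pays off once expected_cost(p) < LOAD_REGISTER,
# i.e. once p exceeds TRY_ALTERNATE / LOAD_REGISTER.
break_even = TRY_ALTERNATE / LOAD_REGISTER
```

For a success probability of .47 the expected cost comes to about 15 minor
cycles against 25 without the alternate; for the BCPL figure of .63 it is
about 11.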

(3.2) Transfers of Control 

Consider transferring control from procedure P to procedure Q. Three
cases arise:

(1) Q is known to be in the same page group; example: Mesa intra-module
call

(2) Q is known to be in a different page group; example: Mesa inter-module
call.

(3) uncertainty; example: Mesa procedure-valued parameters; also:
Bytelisp without block compilation.

Consider imitating (3.1) and always trying a current base register for 
control and, failing that, trying an alternate. The results would be: 

(1) succeeds on current 

(2) fails on current; may succeed on alternate

(3) may succeed on either current or alternate

Some measurements of Mesa programs show that with this policy, for three
different programs, the percent of control transfers mapped by current are:
40.9, 28.4, and 40.7 while the percent mapped by the alternate are: 40.0,
69.1, and 39.6 respectively. Note that the sums of current plus alternate are
somewhat more consistent: 80.9, 97.5, and 80.3 respectively. (It is
conjectured that the somewhat anomalous behavior of the second program - a text
formatter - is caused by very frequent calls out of the "current" module to the
string-manipulation module.)

This differs somewhat from G-mode in that trying the current base register
and then the alternate requires address translation, but not actually going to
the memory. If a fault does not occur, then the contents of MAR after
translation is the new value of P for the (possibly new) current base register
for control. This should be loaded into P(control-current) to produce the new
PC base.

(3.3) Other Applications of Using Base Registers in Pairs 

The above two examples both involve translation of full 24-bit virtual
addresses. Some advantages of using registers in pairs with a current and
alternate apply in D and E mode, in language implementations other than Mesa.
In Mesa, address translation faults are assumed to occur only for the two
reasons given above because all frames for activations are assumed to fit into
a page group. In Smalltalk this will not be the case for instances, since
these effectively form the free storage pool. Consider transferring control
back and forth between two instances P and Q. The code bases for P and Q will
be handled properly and efficiently, as discussed in (3.2). The instance
pointers for P and Q can be handled analogously. Let I be a pair of base
registers used for instances. A micro-instruction simply specifies D-mode
references relative to I. Such references are mapped by I-current.
Trying I-alternate is meaningless in this case. However, when changing to a
new instance, it is necessary to change the interpretation of I. Proceeding as
in (3.2), consider testing I-alternate to see if it does, in fact, map the new
instance base and changing the state of the I-pair if it does, so that the
former I-alternate becomes the new I-current. In this way, the
micro-instruction still specifies only I, while state information associated
with the I-pair distinguishes which of the two I-registers is intended.

(3.4) Fallback to Non-Paired Organization 

For maximum flexibility in base register usage, it seems desirable to
allow optional usage of the 16 base registers individually.

A memory request is specified by an 8-bit specifier, broken down as
follows:

    4-bit C field     - giving partial specification of the base register
    2-bit mode field  - D-mode, E-mode, etc.
    2-bit usage class - Read, Write, etc.

(4.1) Addressing Modes 

The memory interface can be used in 4 modes:

D - displacement from P(B)

E - displacement from CL(B)

F - 24-bit virtual address to be found in the page group described by B

G - 24-bit virtual address to be found in the page group described by B or B's
alternate.

(4.2) Specification of Base Register Number 

In each case, B is specified in the following way. The micro-instruction
supplies a 4-bit C field. C[0:2] specifies one of 8 base register pairs. Let
S(C[0:2]) be the state bit of that pair. Then

    B[0:2] = C[0:2]

    B[3] = C[3] xor S(C[0:2])

This allows either paired or non-paired usage of the base registers, as
follows:

(1) to obtain paired addressing - set the low order bit of C to zero when
assembling the micro-instruction

(2) to obtain non-paired addressing - set the status bit of a pair to zero
when loading either base register of the pair.

In modes D, E, and F, only B is used in translation. In mode G, B is
tried and if it fails then B's alternate is formed by complementing S(B[0:2])
and trying the (new) resultant B. Hence, in G-mode, the memory interface
reports a fault only if B and B's alternate both fail. Further, in G-mode, if
a fault does not occur then the new state of the memory interface is such that
the current element of the pair is the one which succeeded. The state bits to
keep track of the current element of each pair are stored in a special 1x8
memory.
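
The register selection of (4.2) and the G-mode retry can be sketched as
below. This is an illustrative model, not the hardware design: the class and
method names are hypothetical, bit 0 is taken to be the high-order bit so
that C[3] is the low-order bit, pages are assumed to be 2^9 words, and each
register is assumed to hold inclusive page bounds with CL naming the core
page that holds virtual page LVP.

```python
from dataclasses import dataclass, field

PAGE_BITS = 9  # assumed 2^9-word pages


@dataclass
class BaseRegister:
    lvp: int = 1   # low virtual page of the mapped group
    hvp: int = 0   # high virtual page; lvp > hvp means "maps nothing"
    cl: int = 0    # core page holding virtual page lvp (assumed meaning)

    def translate(self, vaddr):
        """Return the core address, or None if vaddr is outside the group."""
        vpage = vaddr >> PAGE_BITS
        if self.lvp <= vpage <= self.hvp:
            return ((self.cl + vpage - self.lvp) << PAGE_BITS) | (vaddr & 0x1FF)
        return None


@dataclass
class MemoryInterface:
    regs: list = field(default_factory=lambda: [BaseRegister() for _ in range(16)])
    state: list = field(default_factory=lambda: [0] * 8)  # the 1x8 state memory

    def select(self, c):
        """B[0:2] = C[0:2]; B[3] = C[3] xor S(C[0:2])."""
        pair = c >> 1
        return (pair << 1) | ((c & 1) ^ self.state[pair])

    def g_mode(self, c, vaddr):
        """Try the current register, then the alternate; None means fault."""
        addr = self.regs[self.select(c)].translate(vaddr)
        if addr is None:
            self.state[c >> 1] ^= 1      # the alternate becomes current
            addr = self.regs[self.select(c)].translate(vaddr)
        return addr


# Hypothetical setup: pair 0 maps virtual pages 4..5 and virtual page 40.
mi = MemoryInterface()
mi.regs[0] = BaseRegister(lvp=4, hvp=5, cl=10)
mi.regs[1] = BaseRegister(lvp=40, hvp=40, cl=20)
```

When both registers fail, the state bit is left pointing at the old
alternate, which is the register that (3.1) says is reloaded while the
fault is processed.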

(4.3) Usage Classes:

Orthogonal to the four modes are four usage classes:

R - read memory & read-protect selected

W - write memory & write-protect selected

E - read memory & execute-protect selected

T - form MAR but do not run the memory

The first three are obvious. The fourth is used for two purposes:

(1) implementing (3.2) and (3.3) above without initiating and hence waiting
for the unneeded memory cycle

(2) changing the value of a base pointer P within a page group, e.g. in moving
the frame pointer down on the stack for a simple hierarchical procedure call.

5. PERFORMANCE

Simulations of Mesa have been run under the previous and the (new) proposed use
of paired base registers. In the paired simulations, pairs were used for read,
write, and code. The resulting fault rates are as follows for three programs
(several million instructions in each case):

                          previous    new

    compiler:               4.6%      2.1%

    text formatter:         2.3%      0.1%

    analyzer/compiler:      8.1%      2.3%

From statistics previously issued on BCPL, it is possible to compute the effect
in one experiment of using paired base registers for user-computed addresses
only, with a single base register for control.

                          previous    new

    experiment #1           4.9%      3.1%

(The effect of paired base registers for control and the effect on the other
two experiments can't be computed from the previously issued data.)



